83 research outputs found

    A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training

    Item neighbourhood methods for collaborative filtering learn a weighted graph over the set of items, where each item is connected to those it is most similar to. The prediction of a user's rating on an item is then given by the ratings of neighbouring items, weighted by their similarity. This paper presents a new neighbourhood approach which we call item fields, whereby an undirected graphical model is formed over the item graph. The resulting prediction rule is a simple generalization of the classical approaches, which takes into account non-local information in the graph, allowing its best results to be obtained when using drastically fewer edges than other neighbourhood approaches. A fast approximate maximum entropy training method based on the Bethe approximation is presented, which uses a simple gradient ascent procedure. When using precomputed sufficient statistics on the Movielens datasets, our method is faster than maximum likelihood approaches by two orders of magnitude.
    Comment: ICML201
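    The classical prediction rule referred to above is easy to state concretely. The following is a minimal sketch of that baseline, not the paper's item-field model; the function and argument names are illustrative. It predicts a user's rating on an item as a similarity-weighted average of that user's ratings on the k most similar items.

    import numpy as np

    def knn_item_prediction(R, S, user, item, k=20):
        # R: (n_users, n_items) rating matrix with np.nan marking unrated entries
        # S: (n_items, n_items) precomputed item-item similarity matrix
        rated = np.where(~np.isnan(R[user]))[0]        # items this user has rated
        rated = rated[rated != item]
        if rated.size == 0:
            return np.nanmean(R[:, item])              # fall back to the item mean
        sims = S[item, rated]
        top = rated[np.argsort(-np.abs(sims))[:k]]     # the k most similar rated items
        w = S[item, top]
        if np.abs(w).sum() == 0:
            return np.nanmean(R[:, item])
        return float(w @ R[user, top] / np.abs(w).sum())

    The item-field model generalises this rule by placing an undirected graphical model over the item graph, so that the prediction also reflects non-local information from items that are not direct neighbours; that generalisation is not reproduced in this sketch.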

    Submodular Multi-Label Learning

    In this paper we present an algorithm to learn a multi-label classifier which attempts to directly optimise the F-score. The key novelty of our formulation is that we explicitly allow for assortative (submodular) pairwise label interactions, i.e., we can leverage the co-occurrence of pairs of labels in order to improve the quality of prediction. Prediction in this model consists of minimising a particular submodular set function, which can be accomplished exactly and efficiently via graph-cuts. Learning, however, is substantially more involved and requires the solution of an intractable combinatorial optimisation problem. We present an approximate algorithm for this problem and prove that it is sound in the sense that it never predicts incorrect labels. We also present a nontrivial test of a sufficient condition for our algorithm to have found an optimal solution. We present experiments on benchmark multi-label datasets, which attest to the value of the proposed technique. We also make available source code that enables the reproduction of our experiments.
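    The prediction step described above, minimising a submodular pairwise energy exactly with a graph cut, can be illustrated on a tiny instance. The sketch below assumes an energy of the form E(y) = sum_i u_i*y_i - sum_{(i,j)} w_ij*y_i*y_j with every w_ij >= 0 (assortative interactions), which is submodular; it is not the paper's F-score prediction rule, and all names are illustrative. It reduces the minimisation to an s-t minimum cut via a standard reparameterisation.

    import networkx as nx

    def min_energy_graphcut(unary, pairwise):
        # Exactly minimise E(y) = sum_i unary[i]*y[i] - sum_{(i,j)} pairwise[(i,j)]*y[i]*y[j]
        # over binary label vectors y, assuming all pairwise weights are >= 0
        # (the submodular case), by reduction to an s-t minimum cut.
        n = len(unary)
        theta = list(unary)                    # reparameterised unary terms
        const = 0.0
        G = nx.DiGraph()
        s, t = 's', 't'
        for (i, j), w in pairwise.items():
            # -w*y_i*y_j == -w*y_j + w*(1 - y_i)*y_j
            theta[j] -= w
            G.add_edge(i, j, capacity=w)       # paid when y_i = 0 and y_j = 1
        for i in range(n):
            if theta[i] >= 0:
                G.add_edge(s, i, capacity=theta[i])      # paid when y_i = 1
            else:
                const += theta[i]
                G.add_edge(i, t, capacity=-theta[i])     # paid when y_i = 0
        cut_value, (source_side, sink_side) = nx.minimum_cut(G, s, t)
        y = [1 if i in sink_side else 0 for i in range(n)]
        return y, cut_value + const

    For example, with unary = [1.0, -0.5, 0.8] and pairwise = {(0, 1): 0.7, (1, 2): 0.4}, the returned labelling [0, 1, 0] with energy -0.5 matches a brute-force search over all 2^3 label vectors. Learning the weights is the hard combinatorial problem the paper addresses; this sketch only covers prediction.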

    A convex formulation for learning scale-free networks via submodular relaxation

    A key problem in statistics and machine learning is the determination of network structure from data. We consider the case where the structure of the graph to be reconstructed is known to be scale-free. We show that in such cases it is natural to formulate structure learning as a convex optimisation problem via a submodular relaxation.

    Approximating the problem, not the solution: An alternative view of point set matching

    This work discusses the issue of approximation in point set matching. In general, one may make two classes of approximations when tackling a matching problem: (1) an algorithmic approximation, which consists in using suboptimal procedures to infer the assignment, and (2) a representational approximation, which involves a simplified and suboptimal model for the original data. Matching techniques have typically relied on the first approach, retaining the complete model and using suboptimal techniques to solve it. In this paper, we show how a technique based on exact inference in simple graphical models, an instance of the second class, can significantly outperform techniques from the first class. We experimentally compare this method with well-known spectral and relaxation methods, which are exemplars of the first class. Our experiments with synthetic and real-world data sets reveal significant performance improvements over a wide operating range.
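    As a concrete illustration of the second class of approximation, one can restrict the graphical model over correspondences to a chain on the template points and run exact MAP inference by dynamic programming. The chain structure and the distance-preservation potential below are illustrative assumptions for this sketch, not the exact model used in this work.

    import numpy as np

    def chain_match(template, scene):
        # Assign each template point to a scene point by exact MAP inference on a
        # chain model over the template points: consecutive template points prefer
        # scene points that preserve their inter-point distance.
        k, m = len(template), len(scene)

        def pair_cost(i, b, a):
            d_t = np.linalg.norm(template[i] - template[i - 1])
            d_s = np.linalg.norm(scene[a] - scene[b])
            return (d_t - d_s) ** 2

        cost = np.zeros((k, m))              # cost[i, a]: best prefix cost with point i -> a
        back = np.zeros((k, m), dtype=int)   # best previous assignment for each (i, a)
        for i in range(1, k):
            for a in range(m):
                c = [cost[i - 1, b] + pair_cost(i, b, a) for b in range(m)]
                back[i, a] = int(np.argmin(c))
                cost[i, a] = min(c)
        match = np.zeros(k, dtype=int)       # backtrack the optimal assignment
        match[-1] = int(np.argmin(cost[-1]))
        for i in range(k - 1, 0, -1):
            match[i - 1] = back[i, match[i]]
        return match

    The chain admits exact inference in O(k*m^2) time, which captures the trade-off argued for above: a simplified model solved exactly, rather than the complete model solved approximately.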